Maps with Markers <a id="6"></a>
import numpy as np # useful for scientific computing in Python
import pandas as pd # primary data structure library
#!pip install folium
import folium
print('Folium installed and imported!')
Folium installed and imported!
Let's import the data on Denver police department incidents using the pandas read_csv() method.
The code below expects the dataset to be available locally as denver_crime.csv and reads it into a pandas dataframe:
df_incidents = pd.read_csv('denver_crime.csv')
print('Dataset downloaded and read into a pandas dataframe!')
Dataset downloaded and read into a pandas dataframe!
df_incidents.head()
| | INCIDENT_ID | OFFENSE_ID | OFFENSE_CODE | OFFENSE_CODE_EXTENSION | Category | OFFENSE_CATEGORY_ID | FIRST_OCCURRENCE_DATE | LAST_OCCURRENCE_DATE | REPORTED_DATE | INCIDENT_ADDRESS | X GEO | Y GEO | X | Y | var | PRECINCT_ID | NEIGHBORHOOD_ID | IS_CRIME | IS_TRAFFIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2.021224e+09 | 2.020000e+15 | 2202 | 0 | burglary-residence-by-force | burglary | 4/18/2021 22:30 | 4/19/2021 5:00 | 4/21/2021 15:25 | 300 W 11TH AVE | 3142828.0 | 1692472.0 | -104.992161 | 39.733543 | 6.0 | 611.0 | civic-center | 1 | 0 |
| 1 | 2.021225e+09 | 2.020000e+15 | 2404 | 0 | theft-of-motor-vehicle | auto-theft | 4/21/2021 23:25 | NaN | 4/22/2021 0:01 | 5700 BLK W DARTMOUTH AVE | 3124936.0 | 1664570.0 | -105.056261 | 39.657203 | 4.0 | 423.0 | bear-valley | 1 | 0 |
| 2 | 2.021601e+10 | 2.020000e+16 | 2399 | 0 | theft-other | larceny | 3/22/2021 12:51 | 3/22/2021 12:51 | 4/21/2021 22:13 | 3412 N HUMBOLDT ST | 3149191.0 | 1703917.0 | -104.969299 | 39.764862 | 2.0 | 211.0 | cole | 1 | 0 |
| 3 | 2.021601e+10 | 2.020000e+16 | 2305 | 0 | theft-items-from-vehicle | theft-from-motor-vehicle | 4/21/2021 12:00 | 4/21/2021 12:05 | 4/21/2021 13:17 | 1900 BLK S CLARKSON ST | 3146781.0 | 1673727.0 | -104.978488 | 39.682023 | 3.0 | 313.0 | platt-park | 1 | 0 |
| 4 | 2.021802e+10 | 2.020000e+16 | 2404 | 0 | theft-of-motor-vehicle | auto-theft | 3/9/2021 12:01 | 4/21/2021 12:20 | 4/21/2021 12:20 | 24050 E 78TH AVE | 3223419.0 | 1730557.0 | -104.704438 | 39.836504 | 7.0 | 759.0 | dia | 1 | 0 |
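So each row consists of 19 columns:
- INCIDENT_ID: The identifier of the incident
- OFFENSE_ID: The identifier of the offense
- OFFENSE_CODE: The numeric code of the offense
- OFFENSE_CODE_EXTENSION: The extension of the offense code
- Category: The type of offense, for example theft-of-motor-vehicle
- OFFENSE_CATEGORY_ID: The broader offense category, for example auto-theft
- FIRST_OCCURRENCE_DATE: The first date and time recorded for the occurrence of the incident
- LAST_OCCURRENCE_DATE: The last date and time recorded for the occurrence of the incident (may be missing)
- REPORTED_DATE: The date and time on which the incident was reported
- INCIDENT_ADDRESS: The address closest to where the incident took place
- X GEO and Y GEO: The incident location in what appears to be a projected coordinate system (not used in this section)
- X: The longitude value of the incident location
- Y: The latitude value of the incident location
- var: A numeric column whose meaning is not described in this section
- PRECINCT_ID: The police precinct
- NEIGHBORHOOD_ID: The neighborhood in which the incident took place
- IS_CRIME: A flag that is 1 when the incident is a crime
- IS_TRAFFIC: A flag that is 1 when the incident is traffic related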
Let's drop the rows that contain missing values and find out how many entries remain in our dataset.
df_incidents.dropna(inplace=True)
df_incidents.shape
(156004, 19)
So, after dropping the rows with missing values, the dataframe consists of 156,004 incidents. In order to reduce computational cost, let's just work with the first 10,000 incidents in this dataset.
limit = 10000
df_incidents = df_incidents.iloc[0:limit, :]
Let's confirm that our dataframe now consists of only 10,000 incidents.
df_incidents.shape
(10000, 19)
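Two choices above are worth a second look: dropna() with no arguments discards a row when any column is missing, and iloc keeps whichever rows happen to come first in the file. The sketch below illustrates both points on a tiny made-up dataframe (the values in toy are invented for illustration and are not taken from the dataset); it is one possible alternative, not the approach this tutorial uses.

```python
import numpy as np
import pandas as pd

# Tiny synthetic stand-in for df_incidents (made-up values, for illustration only)
toy = pd.DataFrame({
    "Category": ["theft-other", "burglary", None, "auto-theft", "theft-other"],
    "X": [-104.99, -105.05, -104.97, np.nan, -104.70],
    "Y": [39.73, 39.65, 39.76, 39.68, 39.83],
    "LAST_OCCURRENCE_DATE": ["4/19/2021 5:00", None, "3/22/2021 12:51", None, None],
})

# dropna() with no arguments removes a row if ANY column is missing
strict = toy.dropna()

# subset= only requires the columns that the map actually needs
mappable = toy.dropna(subset=["Category", "X", "Y"])

# a reproducible random sample instead of the first rows of the file
sample = mappable.sample(n=2, random_state=0)

print(len(toy), len(strict), len(mappable), len(sample))  # 5 1 3 2
```

Whether a random sample is preferable depends on how the file is ordered; if the rows are sorted by date, the first rows cover only a narrow time window.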
Now that we have reduced the data a little bit, let's visualize where these crimes took place in the city of Denver. We will use the default style and we will initialize the zoom level to 12.
latitude = 39.7392
longitude = -104.9903
sanfran_map = folium.Map(location=[latitude, longitude], zoom_start=12)
# display the map of Denver (the variable keeps the name sanfran_map)
sanfran_map
Now let's superimpose the locations of the crimes onto the map. The way to do that in Folium is to create a feature group with its own features and style and then add it to the sanfran_map.
incidents = folium.FeatureGroup()
# loop through the 10,000 crimes and add each to the incidents feature group
for lat, lng in zip(df_incidents.Y, df_incidents.X):
    incidents.add_child(
        folium.CircleMarker(
            [lat, lng],
            radius=5, # define how big you want the circle markers to be
            color='yellow',
            fill=True,
            fill_color='blue',
            fill_opacity=0.6
        )
    )
# add incidents to map
sanfran_map.add_child(incidents)
You can also add some pop-up text that gets displayed when you click on a marker. Let's make each marker display the category of the crime when clicked.
incidents = folium.FeatureGroup()
# loop through the 10,000 crimes and add each to the incidents feature group
for lat, lng in zip(df_incidents.Y, df_incidents.X):
    incidents.add_child(
        folium.CircleMarker(
            [lat, lng],
            radius=5, # define how big you want the circle markers to be
            color='yellow',
            fill=True,
            fill_color='blue',
            fill_opacity=0.6
        )
    )
# add pop-up text to each marker on the map
latitudes = list(df_incidents.Y)
longitudes = list(df_incidents.X)
labels = list(df_incidents.Category)
for lat, lng, label in zip(latitudes, longitudes, labels):
    folium.Marker([lat, lng], popup=label).add_to(sanfran_map)
# add incidents to map
sanfran_map.add_child(incidents)
Isn't this really cool? Now you are able to know what crime category occurred at each marker.
If you find the map too congested with all these markers, there are two remedies to this problem. The simpler solution is to remove these location markers and just add the text to the circle markers themselves as follows:
sanfran_map = folium.Map(location=[latitude, longitude], zoom_start=12)
# loop through the 10,000 crimes and add each to the map
for lat, lng, label in zip(df_incidents.Y, df_incidents.X, df_incidents.Category):
    folium.CircleMarker(
        [lat, lng],
        radius=5, # define how big you want the circle markers to be
        color='yellow',
        fill=True,
        popup=label,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(sanfran_map)
# show map
sanfran_map
The other, more thorough remedy is to group the markers into clusters. Nearby markers are merged into a single cluster, which is labeled with the number of crimes it contains and splits into smaller clusters as you zoom in. These clusters can be thought of as pockets of Denver which you can then analyze separately.
To implement this, we start off by instantiating a MarkerCluster object and adding all the data points in the dataframe to this object.
from folium import plugins
# let's start again with a clean copy of the map of Denver
sanfran_map = folium.Map(location=[latitude, longitude], zoom_start=12)
# instantiate a marker cluster object for the incidents in the dataframe
incidents = plugins.MarkerCluster().add_to(sanfran_map)
# loop through the dataframe and add each data point to the marker cluster
for lat, lng, label in zip(df_incidents.Y, df_incidents.X, df_incidents.Category):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(incidents)
# display map
sanfran_map